Learning long-range spatial dependencies with horizontal gated-recurrent units
Progress in deep learning has spawned great successes in many engineering
applications. As a prime example, convolutional neural networks, a type of
feedforward neural network, are now approaching -- and sometimes even
surpassing -- human accuracy on a variety of visual recognition tasks. Here,
however, we show that these neural networks and their recent extensions
struggle in recognition tasks where co-dependent visual features must be
detected over long spatial ranges. We introduce the horizontal gated-recurrent
unit (hGRU) to learn intrinsic horizontal connections -- both within and across
feature columns. We demonstrate that a single hGRU layer matches or outperforms
all tested feedforward hierarchical baselines including state-of-the-art
architectures which have orders of magnitude more free parameters. We further
discuss the biological plausibility of the hGRU in comparison to anatomical
data from the visual cortex as well as human behavioral data on a classic
contour detection task.
Comment: Published at NeurIPS 2018
https://papers.nips.cc/paper/7300-learning-long-range-spatial-dependencies-with-horizontal-gated-recurrent-unit
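The paper's hGRU couples GRU-style gating with horizontal (lateral) connections learned within a single layer. As a rough illustration of that general idea, and not the authors' implementation (which uses separate excitatory and inhibitory 2D kernels over feature columns), a minimal 1D gated lateral-recurrent update might look like:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lateral(h, kernel):
    """Gather activity from spatial neighbours (same-padded correlation)."""
    pad = len(kernel) // 2
    hp = np.pad(h, pad)
    return np.array([hp[i:i + len(kernel)] @ kernel for i in range(len(h))])

def hgru_like_step(h, x, k_gate, k_mix):
    """One simplified horizontal-recurrent update on a 1D feature row."""
    g = sigmoid(lateral(h, k_gate))          # gate driven by lateral context
    c = np.tanh(x + lateral(g * h, k_mix))   # candidate mixes gated neighbours
    return (1.0 - g) * h + g * c             # GRU-style convex update
```

Iterating `hgru_like_step` lets activity propagate outward one kernel radius per step, which is how recurrent horizontal connections can bind features over spatial ranges far larger than any single receptive field.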
Adaptive recurrent vision performs zero-shot computation scaling to unseen difficulty levels
Humans solving algorithmic (or reasoning) problems typically exhibit solution
times that grow as a function of problem difficulty. Adaptive recurrent neural
networks have been shown to exhibit this property for various
language-processing tasks. However, little work has been performed to assess
whether such adaptive computation can also enable vision models to extrapolate
solutions beyond their training distribution's difficulty level, with prior
work focusing on very simple tasks. In this study, we use two challenging
visual reasoning tasks, PathFinder and Mazes, to investigate a critical
functional role of such adaptive processing in recurrent neural networks:
dynamically scaling computational resources to input requirements, which
allows zero-shot generalization to difficulty levels not seen during
training. We combine convolutional recurrent neural networks (ConvRNNs) with a
learnable halting mechanism based on Graves (2016). We explore various
implementations of such adaptive ConvRNNs (AdRNNs) ranging from tying weights
across layers to more sophisticated biologically inspired recurrent networks
that possess lateral connections and gating. We show that 1) AdRNNs learn to
dynamically halt processing early (or late) to solve easier (or harder)
problems, 2) these RNNs zero-shot generalize to more difficult problem settings
not shown during training by dynamically increasing the number of recurrent
iterations at test time. Our study provides modeling evidence supporting the
hypothesis that recurrent processing enables the functional advantage of
adaptively allocating compute resources conditional on input requirements and
hence allowing generalization to harder difficulty levels of a visual reasoning
problem without additional training.
Comment: 37th Conference on Neural Information Processing Systems (NeurIPS 2023)
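The learnable halting mechanism the abstract attributes to Graves (2016) is Adaptive Computation Time: the network keeps iterating until its accumulated halting probability crosses a threshold, so harder inputs get more recurrent steps. A minimal sketch of that ponder loop (with hypothetical `step_fn` and `halt_fn` standing in for the learned ConvRNN step and halting head) might be:

```python
import numpy as np

def adaptive_halting(step_fn, halt_fn, x, max_steps=50, eps=0.01):
    """Ponder loop in the spirit of Graves (2016) Adaptive Computation Time.

    Runs step_fn until the cumulative halting probability reaches 1 - eps,
    then returns the halting-weighted mean state and the steps used.
    Hypothetical interface: step_fn(h, x) -> h, halt_fn(h) -> prob in (0, 1).
    """
    h = np.zeros_like(x, dtype=float)
    weighted = np.zeros_like(x, dtype=float)
    cum_p = 0.0
    for n in range(1, max_steps + 1):
        h = step_fn(h, x)
        p = float(halt_fn(h))
        if cum_p + p >= 1.0 - eps or n == max_steps:
            weighted += (1.0 - cum_p) * h  # remainder weight closes the mixture
            return weighted, n
        weighted += p * h
        cum_p += p
```

Because the stopping rule depends only on the accumulated probability, raising `max_steps` at test time lets the same trained weights spend more iterations on harder inputs, which is the zero-shot difficulty scaling the paper studies.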
Subtle adversarial image manipulations influence both human and machine perception
Although artificial neural networks (ANNs) were inspired by the brain, ANNs exhibit a brittleness not generally observed in human perception. One shortcoming of ANNs is their susceptibility to adversarial perturbations: subtle modulations of natural images that change classification decisions, such as confidently mislabelling an image of an elephant, initially classified correctly, as a clock. In contrast, a human observer might well dismiss the perturbations as an innocuous imaging artifact. This phenomenon may point to a fundamental difference between human and machine perception, but it raises the question of whether human sensitivity to adversarial perturbations might be revealed with appropriate behavioral measures. Here, we find that adversarial perturbations that fool ANNs similarly bias human choice. We further show that the effect is more likely driven by higher-order statistics of natural images, to which both humans and ANNs are sensitive, than by the detailed architecture of the ANN.
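Adversarial perturbations of the kind this abstract describes are typically generated by gradient-based attacks. A minimal sketch of one classic method, the fast gradient sign method, applied here to a toy logistic classifier rather than the paper's actual ANN, is:

```python
import numpy as np

def fgsm_perturb(x, w, y, eps=0.1):
    """Fast gradient sign method on a toy logistic classifier p = sigmoid(w @ x).

    The input gradient of the cross-entropy loss is (p - y) * w; the attack
    takes one eps-sized step along its sign, a worst-case L-infinity move.
    """
    p = 1.0 / (1.0 + np.exp(-(w @ x)))
    grad = (p - y) * w              # d(loss)/d(x) for label y in {0, 1}
    return x + eps * np.sign(grad)
```

The sign step bounds every pixel change by `eps`, which is what keeps such perturbations subtle to a casual observer while still moving the input across the model's decision boundary.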